Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds support for the /v2/_catalog API #548

Merged
merged 10 commits into from
Oct 25, 2019
Merged

Conversation

jmakinen-ncc
Copy link
Contributor

No description provided.

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@jmakinen-ncc
Copy link
Contributor Author

CLA-wise I just use a different email for github than what I used to commit that

}

// GetCatalog calls /_catalog, returning the list of repositories on the registry
func GetCatalog(target name.Registry, options ...Option) ([]string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the results of catalog can be paginated, there might be a better return type than []string -- I'm not sure what the most go idiomatic thing to do here is...

We could return a channel? Or do callbacks for each "page"?

Or maybe this is just fine...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about now? (sorry to double comment)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd make sense for GetCatalog to do pagination itself, and return all the results it could find from however many requests were necessary to find them. That seems simpler than making callers deal with pagination, or pass a channel, or whatever.

@codecov-io
Copy link

codecov-io commented Sep 20, 2019

Codecov Report

Merging #548 into master will increase coverage by 0.52%.
The diff coverage is 41.46%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #548      +/-   ##
==========================================
+ Coverage   72.34%   72.87%   +0.52%     
==========================================
  Files          95      102       +7     
  Lines        4245     4486     +241     
==========================================
+ Hits         3071     3269     +198     
- Misses        777      806      +29     
- Partials      397      411      +14
Impacted Files Coverage Δ
pkg/crane/catalog.go 0% <0%> (ø)
pkg/v1/remote/catalog.go 68% <68%> (ø)
pkg/v1/google/auth.go 71.42% <0%> (-15.42%) ⬇️
pkg/authn/auth.go 100% <0%> (ø) ⬆️
pkg/v1/tarball/layer.go 77.77% <0%> (ø) ⬆️
pkg/authn/basic.go 100% <0%> (ø) ⬆️
pkg/authn/bearer.go 100% <0%> (ø) ⬆️
pkg/authn/helper.go
pkg/v1/remote/transport/logger.go 100% <0%> (ø)
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 53e1ac5...7f18e32. Read the comment docs.

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@jmakinen-ncc
Copy link
Contributor Author

How about now?


var query string

query = fmt.Sprintf("last=%s&n=%d", url.QueryEscape(last), n)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query := fmt.Sprintf(...)


// GetCatalog calls /_catalog, returning the list of repositories on the registry
func GetCatalog(target name.Registry, last string, n int, options ...Option) ([]string, error) {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unnecessary blank line

return nil, err
}

//TKTK:JM iterate through results with "last"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to be a TODO?

return parsed.Repos, nil
}

//TKTK:JM write tests
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please do. 😄

return nil, err
}

parsed := catalog{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supernit: I typically see this as var parsed catalog which has the same effect but is infinitesimally more idiomatic.

@jmakinen-ncc
Copy link
Contributor Author

Thanks for the comments, back atcha

}

// GetCatalog calls /_catalog, returning the list of repositories on the registry
func GetCatalog(target name.Registry, last string, n int, options ...Option) ([]string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about GetCatalog just doing pagination itself and returning all the values it can find? How many pages of results do we expect to get from this? Tens of thousands?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why I was hesitant to implement it because I'm not sure of the scale.
I was planning on writing a helper for my own purposes that I can add later, but I just needed this at a minimum.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scale might be a concern, but for (most?) users who should expect only a couple pages, it sucks that this makes them handle pagination themselves.

I'd prefer to expose the pagination-handling version for the users that don't want to deal with it, and maybe if we hear that this method is a pain for whatever scaling reason, we can export the DIY method later. If we never hear that we need to expose the guts, that's great.

Does that make any sense?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to handle the pagination internally, we should probably make GetCatalog take a context.Context so that you can cancel it.

it sucks that this makes them handle pagination themselves

This is basically why the crane package exists. IMO we should (within reason) expose all the complexity that the API itself has, then we can expose the happy path in crane.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another pattern I've seen that is nice, is to have the GetCatalog take a callback function that gets called per page. It allows users to abort early (via returning an error / cancelling the context), but it hides the pagination detail from the user.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imjasonh thinks they're not idiomatic go. There's this presentation for inspiration.

I'm not sure if we could borrow something from that, but it also doesn't like callbacks (from here):

Bryan briefly mentions asynchronous callbacks as something programmers from other certain languages sometimes try to use. But he notes that most Go programmers already know not to use them in Go, and moves on to talking about Futures and Producer–Consumer Queues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a synchronous callback - not a async one. The next page wouldn't be fetched until the callback returns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, how do we feel about my new crane.GetCatalog?

@jonjohnsonjr
Copy link
Collaborator

@jmakinen-ncc so, let's just punt on figuring out a reasonable API for paginated requests. I'd say just add your original, easy-to-use implementation under pkg/crane for now. We can come up with a better way to expose pagination later, and we'll actually want to do that for listing tags as well. Filed #549 to track that.

Sorry for the back and forth!

@jmakinen-ncc
Copy link
Contributor Author

Sorry, so does that mean that this can be merged?

}

// GetCatalogPage calls /_catalog, returning the list of repositories on the registry
func GetCatalogPage(target name.Registry, last string, n int, options ...Option) ([]string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is to just make this a private method in the crane/catalog.go file. At some point in the future it can get moved to the remote library when we have interface concensus.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm actually okay with leaving this in. If we ever come up with a good API for this, we can name it GetCatalog, and GetCatalogPage would still be useful.

I am a little bit unhappy with this because it doesn't allow clients to adhere to the spec:

Compliant client implementations should always use the Link header value when proceeding through results linearly.

However, this does enable you to skip ahead, which is immediately after:

The client may construct URLs to skip forward in the catalog.

I think we're basically making it impossible (sometimes, depends on implementation) for a registry to implement this efficiently by handing us a cursor in the Link header, but maybe we don't care.

@jmakinen-ncc
Copy link
Contributor Author

Would anyone be able to help me figure out why the build is failing here?
do I need to write docs for the added command or something?
Also, I feel like keeping the most generic interface in the remote package and then having the pretty one in crane makes more sense right?
So that if anyone wants to do the dirty work of working with pages directly but still having the benefits of having auth and everything handled then they can. I would think, we'd want remote to be the most 1:1 representation of the API possible
but if someone just wants a full list without having to think, they can call the user-friendly crane

@jonjohnsonjr
Copy link
Collaborator

Would anyone be able to help me figure out why the build is failing here?
do I need to write docs for the added command or something?

It's complaining that you haven't run ./hack/update-code-gen.sh to generate the docs. We could probably make that more evident.

Also, I feel like keeping the most generic interface in the remote package and then having the pretty one in crane makes more sense right?
So that if anyone wants to do the dirty work of working with pages directly but still having the benefits of having auth and everything handled then they can. I would think, we'd want remote to be the most 1:1 representation of the API possible

Yeah, definitely. I'm just not sure if this is the API we want to support forever. It has the same drawback as this, where we can't reuse the auth handshake results across multiple pages. I'm also not sure if we want to return just the []string or the whole struct. There's some other stuff we're not really exposing here (e.g. the Link header, which seems like we should respect it as a client, and there's a next field in the response there which seems like a bug in the spec actually).

On the other hand, the crane API is pretty straightforward, so I think it's reasonable to merge that now (so you can use it) until we can figure out the best way to represent this. Do you need the ability to list individual pages? Or is the crane API sufficient for now?

// NewCmdGetCatalog creates a new cobra.Command for the repos subcommand.
func NewCmdGetCatalog() *cobra.Command {
return &cobra.Command{
Use: "repos",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather this just be catalog than repos

@jonjohnsonjr
Copy link
Collaborator

@jmakinen-ncc any interest in pushing this over the finish line? I feel like it's pretty close to merge-able.

@jmakinen-ncc
Copy link
Contributor Author

jmakinen-ncc commented Oct 18, 2019 via email

@jonjohnsonjr
Copy link
Collaborator

sorry

No worries! I was just running through old issues and PRs trying to clean things up, there's no rush 😄

@jmakinen-ncc
Copy link
Contributor Author

Do we have a winner?

@@ -0,0 +1,22 @@
## crane repos
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: drop this file

func init() { Root.AddCommand(NewCmdGetCatalog()) }

// NewCmdGetCatalog creates a new cobra.Command for the repos subcommand.
func NewCmdGetCatalog() *cobra.Command {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd name this "NewCmdCatalog" to be consistent with the rest of the commands.

"github.com/google/go-containerregistry/pkg/v1/remote"
)

// GetCatalog returns the repositories in a registry's catalog
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supernit: add a period at the end of this sentence

n := 100
last := ""
for {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supernit: drop this line

if len(page) < n {
break
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supernit: drop this line

Repos []string `json:"repositories"`
}

// GetCatalogPage calls /_catalog, returning the list of repositories on the registry
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supernit: add a period at the end of this sentence

}

// GetCatalogPage calls /_catalog, returning the list of repositories on the registry
func GetCatalogPage(target name.Registry, last string, n int, options ...Option) ([]string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm actually okay with leaving this in. If we ever come up with a good API for this, we can name it GetCatalog, and GetCatalogPage would still be useful.

I am a little bit unhappy with this because it doesn't allow clients to adhere to the spec:

Compliant client implementations should always use the Link header value when proceeding through results linearly.

However, this does enable you to skip ahead, which is immediately after:

The client may construct URLs to skip forward in the catalog.

I think we're basically making it impossible (sometimes, depends on implementation) for a registry to implement this efficiently by handing us a cursor in the Link header, but maybe we don't care.

responseBody: []byte("notjson"),
wantErr: true,
}}
//TODO: add test cases for pagination
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Planning to do this later? 😄

supernit: add a space between // and TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, but I just wanted to note that they are missing if anyone runs into issues there later and wants to write some.
Do you need this?

Copy link
Collaborator

@jonjohnsonjr jonjohnsonjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@jmakinen-ncc
Copy link
Contributor Author

Thanks everyone!

Copy link
Collaborator

@jonjohnsonjr jonjohnsonjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmakinen-ncc sorry I don't mean to renege, but I just saw these final nits right before I hit merge 😅

The nits mean we care ❤️


func init() { Root.AddCommand(NewCmdCatalog()) }

// NewCmdGetCatalog creates a new cobra.Command for the repos subcommand.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: NewCmdCatalog

"github.com/google/go-containerregistry/pkg/v1/remote"
)

// GetCatalog returns the repositories in a registry's catalog.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can You change these to just Catalog?

Repos []string `json:"repositories"`
}

// GetCatalogPage calls /_catalog, returning the list of repositories on the registry.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can You change these to just CatalogPage?

"github.com/google/go-containerregistry/pkg/name"
)

func TestGetCatalogPage(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestCatalogPage

@@ -0,0 +1,45 @@
// Copyright 2018 Google LLC All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 2019

@@ -0,0 +1,49 @@
// Copyright 2018 Google LLC All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 2019

@@ -0,0 +1,70 @@
// Copyright 2018 Google LLC All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 2019

@@ -0,0 +1,83 @@
// Copyright 2018 Google LLC All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 2019

…ity in!(no worries if I'm still not there tho)
Copy link
Collaborator

@jonjohnsonjr jonjohnsonjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(no worries if I'm still not there tho)

😆

git is my favorite chat protocol

Thanks! Sorry again for all the back and forth :)

@jonjohnsonjr jonjohnsonjr merged commit f9947dc into google:master Oct 25, 2019
@jmakinen-ncc
Copy link
Contributor Author

git is my favorite chat protocol

Ask @clrprod about "Big changes, no promises"

jonjohnsonjr added a commit to jonjohnsonjr/go-containerregistry that referenced this pull request Nov 15, 2019
jonjohnsonjr added a commit that referenced this pull request Nov 15, 2019
* Revert "adds support for the /v2/_catalog API (#548)"

This reverts commit f9947dc.

* Re-apply catalog changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants